
    Crowd4U: An Initiative for Constructing an Open Academic Crowdsourcing Network

    We describe the Crowd4U initiative, which aims at constructing an all-academic, open, and generic platform for microvolunteering and crowdsourcing worldwide. Crowd4U provides a microtask-based platform in which most workers are volunteers at universities and other research institutions. Crowd4U is open in the sense that the platform can interact with other platforms, researchers can register their tasks, and the underlying code is not a black box. It is generic in that it allows virtually any task to be registered. Crowd4U has already been used by several projects for public and academic purposes.

    Human-assisted OCR of Japanese books with different kinds of microtasks

    Human-assisted OCR is a common approach for transcribing books and has been used in many digital library projects. This paper reports on our project to transcribe the book collections of the National Diet Library using this approach. Our project is unique in two ways. First, we try to extend the human-assisted OCR approach by distributing microtasks in many ways other than just showing tasks on a specific Web page on PC screens. Second, we deal with Japanese books, which contain thousands of characters, some of which look similar to each other. This paper shows that we can expect high-quality results even when we transcribe Japanese texts with microtasks, and that we can expect the number of performed microtasks to be stable if we distribute microtasks to equipment with which workers perform microtasks in their daily lives.

    Platform Design for Crowdsourcing and Future of Work

    Online job platforms have proliferated in the last few years. We anticipate a future where there exist thousands of such platforms covering wide swathes of tasks. These include crowdsourcing platforms such as Amazon Mechanical Turk (AMT), CrowdWorks, and Figure Eight; specialized services such as ride-hailing; matching markets such as TaskRabbit, which matches workers with local demand; and so on. It is widely anticipated that a vast majority of the human workforce will be employed on these platforms. In this article, we initiate discussions about an understudied aspect of platform design: how to design platforms that maximize the satisfaction of various stakeholders. We also contribute a novel taxonomy for platform ecosystems that categorizes existing and emerging platforms. Finally, we discuss the need for interoperability between these platforms so that workers and requesters are not tied to a single platform.

    A Crowdsourcing Approach for Finding Misidentifications of Bibliographic Records

    Because there is no perfect technique for automatic identification of bibliographic records, cleaning the identification results manually is indispensable. However, recruiting people for the task is often difficult. This paper discusses a microtask-based crowdsourcing approach to the problem. An important issue is designing a good strategy for generating the tasks to be assigned to workers, one that maintains quality while reducing the number of tasks. In this study, we explore a design space defined by two criteria to reduce the number of assigned microtasks for finding misidentifications caused by automatic identification techniques. We compare four task-generation strategies using bibliographic records of the National Diet Library. One of the strategies reduced the number of tasks by 55.7% relative to the baseline strategy, and statistical analysis showed that the quality of its results is comparable to that of the other three strategies. (Published or submitted for publication; peer reviewed.)

    Proactive Preservation of World Heritage by Crowdsourcing and 3D Reconstruction Technology

    Since over one million tourists visit the Angkor ruins annually, the effect on the buildings of the vibrations caused by these tourists is a huge problem for maintaining them. Organisms such as bryophytes, which adhere to the surface of the stones of the ruins, are another factor that damages them. Using crowdsourcing and 3D reconstruction technology, we are organizing a proactive preservation project for the Angkor Thom Bayon Temple, which is a world cultural heritage site. We evaluated its damaged parts and visualized the damaged state. Published in: 2017 IEEE International Conference on Big Data (Big Data). Date of Conference: 11-14 Dec. 2017. Conference Location: Boston, MA, US

    A System for Worldwide COVID-19 Information Aggregation

    The global COVID-19 pandemic has made the public pay close attention to related news covering various domains, such as sanitation, treatment, and effects on education. Meanwhile, the COVID-19 situation differs greatly among countries (e.g., in policies and in the development of the epidemic), and thus citizens would be interested in news from foreign countries. We build a system for worldwide COVID-19 information aggregation (http://lotus.kuee.kyoto-u.ac.jp/NLPforCOVID-19) containing reliable articles from 10 regions in 7 languages, sorted by topic, for Japanese citizens. Our dataset of reliable COVID-19-related websites, collected through crowdsourcing, ensures the quality of the articles. A neural machine translation module translates articles in other languages into Japanese. A BERT-based topic classifier trained on an article-topic pair dataset helps users find information of interest efficiently by putting articles into different categories. Comment: Poster at the NLP COVID-19 Workshop at ACL 2020, 4 pages, 3 figures, 7 tables

    SMART: A Tool for Semantic-Driven Creation of Complex XML Mappings

    We focus on the problem of data transformations, i.e., how to transform data to another structure to adapt it to new application requirements or given environments. Here, we define data transformation as the process of taking as input two schemas A and

    Human+AI Crowd Task Assignment Considering Result Quality Requirements

    This paper addresses the problem of dynamically assigning tasks to a crowd consisting of AI and human workers. Currently, crowdsourcing the creation of AI programs is a common practice. To apply such kinds of AI programs to the set of tasks, we often take the "all-or-nothing" approach that waits for the AI to be good enough. However, this approach may prevent us from exploiting the answers provided by the AI until the process is completed, and also prevents the exploration of different AI candidates. Therefore, integrating the created AI, both with other AIs and human computation, to obtain a more efficient human-AI team is not trivial. In this paper, we propose a method that addresses these issues by adopting a "divide-and-conquer" strategy for AI worker evaluation. Here, the assignment is optimal when the number of task assignments to humans is minimal, as long as the final results satisfy a given quality requirement. This paper presents some theoretical analyses of the proposed method and an extensive set of experiments conducted with open benchmarks and real-world datasets. The results show that the algorithm can assign many more tasks than the baselines to AI when it is difficult for AIs to satisfy the quality requirement for the whole set of tasks. They also show that it can flexibly change the number of tasks assigned to multiple AI workers in accordance with the performance of the available AI workers.
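
The idea of evaluating an AI worker per subset of tasks, rather than all-or-nothing, might be sketched roughly as follows. This is an illustrative toy, not the algorithm proposed in the paper: the grouping function `group_of`, the `min_samples` cutoff, and the plain accuracy check (in place of any statistical guarantee) are all assumptions made for the example.

```python
from collections import defaultdict

def assign_tasks(tasks, group_of, ai_predict, human_labels,
                 quality=0.9, min_samples=3):
    """Split unlabeled tasks between an AI worker and human workers.

    Hypothetical sketch: tasks are partitioned into groups, the AI is
    scored only on the tasks humans have already answered in each group,
    and the rest of a group goes to the AI only if that score meets the
    quality requirement. All names here are illustrative assumptions.
    """
    by_group = defaultdict(list)
    for task in tasks:
        by_group[group_of(task)].append(task)

    ai_tasks, human_tasks = [], []
    for members in by_group.values():
        labeled = [t for t in members if t in human_labels]
        unlabeled = [t for t in members if t not in human_labels]
        # Count how often the AI agrees with the human answers so far.
        correct = sum(ai_predict(t) == human_labels[t] for t in labeled)
        trusted = (len(labeled) >= min_samples
                   and correct / len(labeled) >= quality)
        (ai_tasks if trusted else human_tasks).extend(unlabeled)
    return ai_tasks, human_tasks
```

With an AI that is reliable on only one group, that group's remaining tasks go to the AI while the other group stays with humans, which reduces human assignments without waiting for the AI to be good enough on the whole task set.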

    Efficient Evaluation of XML Middle-ware Queries

    We address the problem of efficiently constructing materialized XML views of relational databases. In our setting, the XML view is specified by a query in the declarative query language of a middle-ware system, called SilkRoute. The middle-ware system evaluates a query by sending one or more SQL queries to the target relational database, integrating the resulting tuple streams, and adding the XML tags. We focus on how to best choose the SQL queries, without having control over the target RDBMS.
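
The evaluation pipeline described in this abstract (send SQL to the database, consume the sorted tuple stream, add XML tags) can be illustrated with a minimal sketch. It does not use SilkRoute or its query language: the in-memory SQLite database stands in for the target RDBMS, and the `supplier`/`part` schema and both function names are assumptions invented for the example. It shows only the single-query plan, where one outer join is issued and the nesting is recovered in the middle layer.

```python
import sqlite3
import xml.etree.ElementTree as ET
from itertools import groupby

def demo_db():
    """Build a small in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE supplier (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE part (id INTEGER PRIMARY KEY,
                           supplier_id INTEGER, title TEXT);
        INSERT INTO supplier VALUES (1, 'Acme'), (2, 'Bolt');
        INSERT INTO part VALUES (1, 1, 'gear'), (2, 1, 'nut');
    """)
    return conn

def build_view(conn):
    """Materialize a nested XML view from one sorted tuple stream."""
    rows = conn.execute(
        "SELECT s.name, p.title FROM supplier s "
        "LEFT JOIN part p ON p.supplier_id = s.id "
        "ORDER BY s.name, p.title"
    )
    root = ET.Element("suppliers")
    # The stream is sorted by supplier, so consecutive rows can be
    # grouped into one <supplier> element without buffering everything.
    for name, group in groupby(rows, key=lambda row: row[0]):
        supplier = ET.SubElement(root, "supplier", name=name)
        for _, title in group:
            if title is not None:  # NULL means the supplier has no parts
                ET.SubElement(supplier, "part").text = title
    return ET.tostring(root, encoding="unicode")
```

An alternative plan would issue one query per table and merge the two sorted streams in the middle layer; choosing between such plans is the kind of decision the abstract refers to.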